
arXiv.org e-Print Archive

Characteristics of transposable element exonization within human and mouse

Author: A Athanasiadis
A Corvelo
A Gerber
A Goren
A Levy
A Magen
A Nekrutenko
A Resch
AFA Smit
Agnes Hotz-Wagenblatt
B Giardine
B Mersch
BR Graveley
Britta Mersch
C Liu
D Karolchik
D Labuda
DD Kim
E Kim
ES Lander
EY Levanon
G Ast
G Lev-Maor
G Lev-Maor
Gil Ast
H Xie
Ilya Ruvinsky
J Hull
J Jurka
J Jurka
JO Kriegs
JO Yang
JP Nemes
K Nakabayashi
KP Kister
L Lin
L Lin
M Amit
M Blow
M Krull
M Moller-Krull
M Roy
M Sironi
M Sironi
MA Batzer
MD Koob
N Gal-Mark
N Gal-Mark
N Sela
NH Gehring
Noa Sela
O Ram
P Deininger
PL Deininger
R Cordaux
R Sorek
R Sorek
R Sorek
RA Gibbs
RE Mills
RH Waterston
RM Kuhn
RT Hillman
S He
S Schwartz
SK Ng
SS Singer
ST Sherry
T Kwan
T Kwan
VV Kapitonov
W Makalowski
WJ Kent
WL Chen
WS Lo
XH Zhang
Y Xing
YF Chang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/06/2010

Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidation of the evolutionary constraints that have shaped fixation of transposed elements within human and mouse protein coding genes and subsequent exonization is important for understanding how the exonization process has affected transcriptome and proteome complexities. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes. Comment: 11 pages, 4 figures

Does Selection against Transcriptional Interference Shape Retroelement-Free Regions in Mammalian Genomes?

Author: A Mazo
AE Peaston
AF Smit
AFA Smit
AR Muotri
AV Evsikov
C Conte
C Pittoggi
C Rizzon
C Simons
C Simons
C Tribioli
CB Kimmel
D Duboule
D Karolchik
DP Bartel
DV Babushok
E Wienholds
EM Ostertag
ES Lander
Eske Willerslev
ET Prak
I Arkhipova
IN Chesnokov
J Meunier
JA van den Hurk
JC Szucsik
JL Garcia-Perez
JT Eppig
JT Eppig
K Theiler
M Sironi
M Speek
MA Batzer
MJ Gardner
P Medstrand
P Medstrand
P Medstrand
P Nigumann
R Beraldi
R Morgan
RH Waterston
Rodolfo Aramayo
RS Poethig
RW Carthew
S Boissinot
S Griffiths-Jones
S Kubo
T Iimura
T Wicker
TJ Hubbard
Tobias Mourier
WJ Kent
WM Chu
Y Zhao
Publication venue: Public Library of Science
Publication date: 01/01/2008

BACKGROUND: Eukaryotic genomes are scattered with retroelements that proliferate through retrotransposition. Although retroelements make up around 40 percent of the human genome, large regions are found to be completely devoid of retroelements. This has been hypothesised to be a result of genomic regions being intolerant to insertions of retroelements. The inadvertent transcriptional activity of retroelements may affect neighbouring genes, which in turn could be detrimental to an organism. We speculate that such retroelement transcription, or transcriptional interference, is a contributing factor in generating and maintaining retroelement-free regions in the human genome. METHODOLOGY/PRINCIPAL FINDINGS: Based on the known transcriptional properties of retroelements, we expect long interspersed elements (LINEs) to be able to display a high degree of transcriptional interference. In contrast, we expect short interspersed elements (SINEs) to display very low levels of transcriptional interference. We find that genomic regions devoid of long interspersed elements (LINEs) are enriched for protein-coding genes, but that this is not the case for regions devoid of short interspersed elements (SINEs). This is expected if genes are subject to selection against transcriptional interference. We do not find microRNAs to be associated with genomic regions devoid of either SINEs or LINEs. We further observe an increased relative activity of genes overlapping LINE-free regions during early embryogenesis, where activity of LINEs has been identified previously. CONCLUSIONS/SIGNIFICANCE: Our observations are consistent with the notion that selection against transcriptional interference has contributed to the maintenance and/or generation of retroelement-free regions in the human genome

Copenhagen University Research Information System

Mosquitoes LTR Retrotransposons: A Deeper View into the Genomic Sequence of Culex quinquefasciatus

A set of 67 novel LTR-retrotransposons has been identified by in silico analyses of the Culex quinquefasciatus genome using the LTR_STRUC program. Phylogenetic analysis shows that 29 of the novel, putatively functional LTR-retrotransposons detected belong to the Ty3/gypsy group. Our results demonstrate that, when only families containing potentially autonomous LTR-retrotransposons are considered, they account for about 1% of the genome of C. quinquefasciatus. Previous studies have estimated that 29% of the genome of C. quinquefasciatus is occupied by mobile genetic elements.

CiteSeerX

Archivio istituzionale della ricerca - Università di Bari

FigShare

Repetitive Elements May Comprise Over Two-Thirds of the Human Genome

Author: A Nekrutenko
A. P. Jason de Koning
AFA Smit
AL Price
AR Quinlan
C Feschotte
DA Ray
David D. Pollock
E Lerat
EE Eichler
EF Kirkness
G Achaz
G Benson
G Lunter
Gregory P. Copenhaver
H Quesneville
HH Kazazian Jr
J Brosius
J Jurka
J Jurka
J Jurka
JS Mattick
JU Pontius
K Lindblad-Toh
M Pheasant
MA Batzer
Mark A. Batzer
MC Frith
R Li
RC Edgar
RM Kuhn
S Karlin
S Kurtz
SF Altschul
TA Castoe
Todd A. Castoe
TS Mikkelsen
W Gu
Wanjun Gu
WC Warren
Z Bao
Publication venue: Public Library of Science
Publication date: 01/12/2011

Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed

Louisiana State University

Comparative structural analysis of Bru1 region homeologs in Saccharum spontaneum and S. officinarum

Author: A Cuadrado
A D'Hont
A D’hont
A Mortazavi
ACE Darling
AFA Smit
AFA Smit
AH Paterson
Anupma Sharma
BC Thomas
BL Cantarel
C Asnaghi
C Asnaghi
C Asnaghi
D Swarbreck
D Wang
E Talamas
F DM
F Tajima
G Blanc
G Bremer
H Ozkan
J Daniels
J Jurka
J Ma
J Zhang
J Zhang
JA Chapman
JA Tate
JH Daugrois
Jianping Wang
Jisen Zhang
JL Bennetzen
JP Tomkins
JP Vogel
JY Hoarau
K Ilic
L Cunff Le
L Fu
L Grivet
LE Flagel
Leiting Li
Lin Zhu
M Freeling
M Lynch
M Shimazaki
M Suyama
MA Larkin
MJ Sullivan
MM Fitch
N Berding
N Chantret
N Jannoo
NV Nair
O Garsmeur
P SanMiguel
Q Yu
Qingyi Yu
R Ming
R Ming
Ray Ming
RC Edgar
RJA Buggs
S Ohno
S Ouyang
S Ouyang
S Price
S Price
S Schwartz
T Flutre
TJ Carver
VE Prince
VN Babenko
W Li
WL Burnquist
Xingtan Zhang
YH Lu
Youqiang Chen
Z Yang
Publication venue: Springer Science and Business Media LLC

Elusive Copy Number Variation in the Mouse Genome

Author: AB Olshen
AFA Smit
Amarjit Bhomra
AR Quinlan
Avigail Agam
B Ewing
B Yalcin
BE Stranger
Binnaz Yalcin
C Curtis
Caleb Webber
Christopher Holmes
CN Henrichsen
D Gordon
D St Clair
Daniel J. Kliebenstein
DE Watkins-Chow
DF Conrad
DF Conrad
DP Locke
DQ Nguyen
EJ Hollox
G Cutler
GH Perry
GJ Huang
Jonathan Flint
JP Schouten
JP Schouten
JR Lupski
KA Frazer
LD Orozco
LM Boyden
M Kubista
Matthew Cubin
NM Maas
P Cahan
P Cahan
P Hupe
R Redon
Richard Mott
S Rozen
SA McCarroll
SW Scherer
TA Graubert
TS Price
WB Breunis
X She
Publication venue: Public Library of Science
Publication date: 01/01/2010

Array comparative genomic hybridization (aCGH) to detect copy number variants (CNVs) in mammalian genomes has led to a growing awareness of the potential importance of this category of sequence variation as a cause of phenotypic variation. Yet there are large discrepancies between studies, so that the extent of the genome affected by CNVs is unknown. We combined molecular and aCGH analyses of CNVs in inbred mouse strains to investigate this question. Using a 2.1 million probe array we identified 1,477 deletions and 499 gains in 7 inbred mouse strains. Molecular characterization indicated that approximately one third of the CNVs detected by the array were false positives and we estimate the false negative rate to be more than 50%. We show that low concordance between studies is largely due to the molecular nature of CNVs, many of which consist of a series of smaller deletions and gains interspersed by regions where the DNA copy number is normal. Our results indicate that CNVs detected by arrays may be the coincidental co-localization of smaller CNVs, whose presence is more likely to perturb an aCGH hybridization profile than the effect of an isolated, small, copy number alteration. Our findings help explain the hitherto unexplored discrepancies between array-based studies of copy number variation in the mouse genome.
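The error rates quoted in this abstract can be combined in a rough back-of-the-envelope calculation. The sketch below is only illustrative: it assumes that "approximately one third" means exactly 1/3, that the false negative rate is taken at its stated lower bound of 50%, and that both rates apply uniformly to the pooled deletions and gains. None of these assumptions, nor the function name `implied_cnv_counts`, comes from the paper itself.

```python
def implied_cnv_counts(calls: int, false_positive_fraction: float,
                       false_negative_rate: float) -> tuple[float, float]:
    """Estimate confirmed calls and the implied number of real CNVs.

    calls: CNVs reported by the array.
    false_positive_fraction: share of reported calls that are not real.
    false_negative_rate: share of real CNVs that the array misses.
    """
    # Calls that survive molecular validation.
    confirmed = calls * (1.0 - false_positive_fraction)
    # If the array only recovers (1 - FN rate) of real CNVs,
    # the real total is confirmed / (1 - FN rate).
    implied_total = confirmed / (1.0 - false_negative_rate)
    return confirmed, implied_total


# Figures quoted in the abstract: 1,477 deletions and 499 gains.
array_calls = 1477 + 499
confirmed, implied_total = implied_cnv_counts(array_calls, 1 / 3, 0.5)
print(f"array calls: {array_calls}")
print(f"confirmed (assumed 1/3 false positives): {confirmed:.0f}")
print(f"implied real CNVs (lower bound, assumed 50% missed): {implied_total:.0f}")
```

Because the abstract gives the false negative rate as "more than 50%", the implied total under these assumptions should be read as a lower bound rather than an estimate.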

Online Research @ Cardiff

HAL-Inserm

Oxford University Research Archive

The First Sequenced Carnivore Genome Shows Complex Host-Endogenous Retrovirus Relationships

Author: A Katzourakis
AFA Smit
AL Roca
AM Barrio
C Bartholomew
CA Wilson
CM Romano
D Karolchik
DF Conrad
DH Huson
EK Karlsson
Erik Bongcam-Rudloff
ES Lander
Farid Benachenhou
GE Liu
GO Sperber
Göran Andersson
Göran O. Sperber
Hiroaki Matsunami
J Blomberg
JD Thompson
JE Clough
JN Volff
Jonas Blomberg
K Beemon
K Lindblad-Toh
L Benit
LN van de Lagemaat
LN van de Lagemaat
Marie Ekerljung
N Saitou
P Jern
P Jern
P Jern
Patric Jern
PJ Cock
PL Deininger
PM Sharp
PN Tsichlis
RC Edgar
RE Tarlinton
RM Kuhn
V Blikstad
WH Chen
WJ Kent
Álvaro Martínez Barrio
Publication venue: Public Library of Science
Publication date: 12/05/2011

Host-retrovirus interactions influence the genomic landscape and have contributed substantially to mammalian genome evolution. To gain further insights, we analyzed a female boxer (Canis familiaris) genome for complexity and integration pattern of canine endogenous retroviruses (CfERV). Intriguingly, the first such in-depth analysis of a carnivore species identified 407 CfERV proviruses that represent only 0.15% of the dog genome. In comparison, the same detection criteria identified about six times more HERV proviruses in the human genome, which has been estimated to contain a total of 8% retroviral DNA including solitary LTRs. These observed differences between man and dog are likely due to different mechanisms to purge, restrict and protect their genomes against retroviruses. A novel group of gammaretrovirus-like CfERV with high similarity to HERV-Fc1 was found to have potential for active retrotransposition and possibly lateral transmissions between dog and human as a result of close interactions during at least 10,000 years. The CfERV integration landscape showed a non-uniform intra- and inter-chromosomal distribution. As in other species, different densities of ERVs were observed. Some chromosomal regions were essentially devoid of CfERVs whereas other regions had large numbers of integrations, in agreement with distinct selective pressures at different loci. Most CfERVs were integrated in antisense orientation within 100 kb from annotated protein-coding genes. This integration pattern provides evidence for selection against CfERVs in sense orientation relative to chromosomal genes. In conclusion, this ERV analysis of the first carnivorous species supports the notion that different mammals interact distinctively with endogenous retroviruses and suggests that retroviral lateral transmissions between dog and human may have occurred.

microPIR: An Integrated Database of MicroRNA Target Sites within Human Promoter Sequences

Author: A Grimson
A Siepel
AFA Smit
BA Janowski
BM Engels
BP Lewis
C Zhang
Chaiwat Bootchai
Chumpol Ngamphiw
D Karolchik
DH Kim
DP Bartel
DP Bartel
E Blanco
F Xiao
G Ruvkun
H Dweep
I. King Jordan
IH Consortium
J Kruger
J Piriyapongsa
JC Carrington
Jittima Piriyapongsa
JM Claverie
LC Li
LD Stein
M Ashburner
M Blanchette
M Hafner
M Hirakawa
M Kanehisa
M Khorshid
PA Fujita
Q Jiang
RF Place
RH Waterston
S Griffiths-Jones
S Nam
S Volinia
SD Hsu
Sissades Tongsima
ST Sherry
ST Younger
ST Younger
V Ambros
V Ambros
W Filipowicz
X Wang
Y Huang
Publication venue: Public Library of Science
Publication date: 16/03/2012

Background: microRNAs are generally understood to regulate gene expression through binding to target sequences within 3′-UTRs of mRNAs. Therefore, computational prediction of target sites is usually restricted to these gene regions. Recent experimental studies, though, have suggested that microRNAs may alternatively modulate gene expression by interacting with promoters. A database of potential microRNA target sites in promoters would stimulate research in this field, leading to a better understanding of complex microRNA regulatory mechanisms. Methodology: We developed a database hosting predicted microRNA target sites located within human promoter sequences and their associated genomic features, called microPIR (microRNA-Promoter Interaction Resource). microRNA seed sequences were used to identify perfect complementary matching sequences in the human promoters and the potential target sites were predicted using the RNAhybrid program. 15 million target sites were identified which are located within 5000 bp upstream of all human genes, on both sense and antisense strands. The experimentally confirmed argonaute (AGO) binding sites and EST expression data including the sequence conservation across vertebrate species of each predicted target are presented for researchers to appraise the quality of predicted target sites. The microPIR database integrates various annotated genomic sequence databases, e.g. repetitive elements, transcription factor binding sites, CpG islands, and SNPs, offering users the facility to extensively explore relationships among target sites and other genomi
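The seed-matching step described in this abstract, finding sequences perfectly complementary to a microRNA seed on both strands of a promoter, can be sketched as follows. This is a minimal illustration and not the microPIR pipeline: the function names are hypothetical, the 7-nt seed and the toy promoter string are made up for the example, and the subsequent RNAhybrid prediction step is not reproduced.

```python
def reverse_complement(rna_seed: str) -> str:
    """Return the DNA reverse complement of an RNA seed sequence."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna_seed.upper()))


def find_seed_sites(seed: str, promoter: str) -> list[tuple[int, str]]:
    """Find perfect seed-complementary sites on both strands of a promoter.

    promoter is the sense-strand DNA sequence, read 5' to 3'.
    Returns (0-based start on the sense strand, strand) pairs.
    """
    promoter = promoter.upper()
    # A site on the sense strand reads as the reverse complement of the seed.
    sense_motif = reverse_complement(seed)
    # A site on the antisense strand shows up on the sense strand
    # as the seed itself, written in DNA letters.
    antisense_motif = seed.upper().replace("U", "T")
    hits = []
    for strand, motif in (("+", sense_motif), ("-", antisense_motif)):
        start = promoter.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(motif, start + 1)  # allow overlapping hits
    return sorted(hits)


# Toy example: an assumed 7-nt seed and an invented 20-bp "promoter".
toy_seed = "GAGGUAG"
toy_promoter = "AACTACCTCGGGAGGTAGTT"
for position, strand in find_seed_sites(toy_seed, toy_promoter):
    print(position, strand)
```

Plain substring search is used here only because the abstract specifies perfect complementarity; any tolerance for mismatches or G:U wobble pairs would need a different approach.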

CiteSeerX

Small Deletion Variants Have Stable Breakpoints Commonly Associated with Alu Elements

Author: A Bacolla
Adam J. de Smith
AFA Smit
AJ de Smith
AJ Iafrate
AJ Sharp
Alexandra I. F. Blakemore
BE Stranger
CY Chan
D Karolchik
D Karolchik
DA Hinds
DP Locke
E Eden
E Gonzalez
E Tuzun
EV Linardopoulou
GH Perry
GJ Cost
GM Cooper
Israel Steinfeld
J Sebat
J Sebat
JA Lee
JC Barrett
JO Korbel
K Han
K Lee
KK Wong
Lachlan J. M. Coin
M Dewannieux
M Dewannieux
M Fanciulli
M Krawczak
Michael Lichten
P Scheet
PA Callinan
Philippe Froguel
PM Kim
R Chenna
R Redon
RD Wells
Rob Sladek
Robin G. Walters
S Gonzalez-Barrera
S Rozen
SA McCarroll
SK Sen
TJ Hubbard
Zohar Yakhini
Publication venue: Public Library of Science
Publication date: 01/01/2008

Copy number variants (CNVs) contribute significantly to human genomic variation, with over 5000 loci reported, covering more than 18% of the euchromatic human genome. Little is known, however, about the origin and stability of variants of different size and complexity. We investigated the breakpoints of 20 small, common deletions, representing a subset of those originally identified by array CGH, using Agilent microarrays, in 50 healthy French Caucasian subjects. By sequencing PCR products amplified using primers designed to span the deleted regions, we determined the exact size and genomic position of the deletions in all affected samples. For each deletion studied, all individuals carrying the deletion share identical upstream and downstream breakpoints at the sequence level, suggesting that the deletion event occurred just once and later became common in the population. This is supported by linkage disequilibrium (LD) analysis, which has revealed that most of the deletions studied are in moderate to strong LD with surrounding SNPs, and have conserved long-range haplotypes. Analysis of the sequences flanking the deletion breakpoints revealed an enrichment of microhomology at the breakpoint junctions. More significantly, we found an enrichment of Alu repeat elements, the overwhelming majority of which intersected deletion breakpoints at their poly-A tails. We found no enrichment of LINE elements or segmental duplications, in contrast to other reports. Sequence analysis revealed enrichment of a conserved motif in the sequences surrounding the deletion breakpoints, although whether this motif has any mechanistic role in the formation of some deletions has yet to be determined. Considered together with existing information on more complex inherited variant regions, and reports of de novo variants associated with autism, these data support the presence of different subgroups of CNV in the genome which may have originated through different mechanisms